feat: add @agentic-db/documents-loader package and CLI docs command #37
Merged
pyramation merged 3 commits into main on Apr 30, 2026
- New package: @agentic-db/documents-loader for importing/exporting text-based files (md, mdx, txt, rst, html, yaml, json, csv, etc.) into the documents table
- Parser with frontmatter extraction for markdown/mdx files
- Directory scanner with configurable extension and ignore filters
- Importer with last-write-wins conflict resolution (upsert by repo_name + file_path)
- Exporter with optional frontmatter generation
- SDK client adapter using duck-typed interfaces for compatibility
- CLI: agentic-db docs import/export/list commands
- 52 tests covering parser, scanner, importer, exporter, and roundtrip scenarios
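The upsert described above (match on repo_name + file_path, last write wins) can be sketched against a duck-typed client as follows. This is a minimal illustration, not the package's actual code: `DocumentRecord`, `DocumentClient`, `upsertDocument`, and `createMemoryClient` are hypothetical names, and the real SDK's method signatures may differ.

```typescript
// Hypothetical record shape, for illustration only.
interface DocumentRecord {
  id: string;
  repoName: string;
  filePath: string;
  content: string;
}

// Duck-typed client: any object with these three methods is accepted.
// The real SDK's signatures are an assumption here.
interface DocumentClient {
  findFirst(where: { repoName: string; filePath: string }): Promise<DocumentRecord | null>;
  create(data: Omit<DocumentRecord, 'id'>): Promise<DocumentRecord>;
  update(id: string, data: Partial<Omit<DocumentRecord, 'id'>>): Promise<DocumentRecord>;
}

type UpsertResult = { action: 'created' | 'updated'; doc: DocumentRecord };

// Last-write-wins: an existing row for the same (repoName, filePath) is
// overwritten without comparing timestamps or content.
async function upsertDocument(
  client: DocumentClient,
  repoName: string,
  filePath: string,
  content: string
): Promise<UpsertResult> {
  const existing = await client.findFirst({ repoName, filePath });
  if (existing) {
    const doc = await client.update(existing.id, { content });
    return { action: 'updated', doc };
  }
  const doc = await client.create({ repoName, filePath, content });
  return { action: 'created', doc };
}

// In-memory stand-in for the database, in the spirit of a mock client.
function createMemoryClient(): DocumentClient {
  const rows = new Map<string, DocumentRecord>();
  let nextId = 1;
  return {
    async findFirst({ repoName, filePath }) {
      for (const row of rows.values()) {
        if (row.repoName === repoName && row.filePath === filePath) return row;
      }
      return null;
    },
    async create(data) {
      const doc: DocumentRecord = { id: String(nextId++), ...data };
      rows.set(doc.id, doc);
      return doc;
    },
    async update(id, data) {
      const current = rows.get(id);
      if (!current) throw new Error(`no document with id ${id}`);
      const doc: DocumentRecord = { ...current, ...data };
      rows.set(id, doc);
      return doc;
    },
  };
}
```

Because the importer only needs the three methods on `DocumentClient`, the same function can run against a mock in tests and against the generated SDK in production, provided the SDK exposes a compatible shape.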
The CLI now depends on @agentic-db/documents-loader, which needs to be built before the e2e tests can resolve it via tsx.
Test Results: @agentic-db/documents-loader
Testing approach: Shell-based testing against the built package with temp directories and mock clients (no DB credentials available). All 8 tests passed (86 total assertions).
Adversarial edge cases tested
Roundtrip & conflict resolution
Not tested (no DB access)
- Add self-contained gitignore parser (no external deps) with 22 tests
- Scanner now reads .gitignore files (root + nested) and skips ignored paths
- Add skipGitignore option to ScanOptions for opting out
- Export gitignore utilities from package index for potential upstream to dev-utils
- Add documents-loader-tests CI job (no database required)
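To make the gitignore handling concrete, here is a rough dependency-free sketch of how such a matcher could work. It is not the code in this PR: `parseGitignore` and `isIgnored` are hypothetical names, it covers only a subset of the syntax (comments, negation, directory-only patterns, anchoring, `*`, `?`, `**`), it treats `[` as a literal character rather than a character class, and it handles a single .gitignore file rather than nested ones.

```typescript
interface Rule {
  regex: RegExp;
  negated: boolean;
  dirOnly: boolean;
}

// Escape characters that are special in a regular expression.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Translate one gitignore glob into regex source (no anchors added here).
function globToRegexSource(glob: string): string {
  let out = '';
  let i = 0;
  while (i < glob.length) {
    const ch = glob[i];
    if (ch === '*' && glob[i + 1] === '*') {
      if (glob[i + 2] === '/') {
        out += '(?:.*/)?'; // `**/` spans zero or more directories
        i += 3;
      } else {
        out += '.*'; // trailing `**` matches everything below
        i += 2;
      }
    } else if (ch === '*') {
      out += '[^/]*'; // `*` never crosses a path separator
      i += 1;
    } else if (ch === '?') {
      out += '[^/]';
      i += 1;
    } else {
      out += escapeRegex(ch);
      i += 1;
    }
  }
  return out;
}

function parseGitignore(text: string): Rule[] {
  const rules: Rule[] = [];
  for (const raw of text.split(/\r?\n/)) {
    let pattern = raw.trim();
    if (pattern === '' || pattern.startsWith('#')) continue;
    const negated = pattern.startsWith('!');
    if (negated) pattern = pattern.slice(1);
    const dirOnly = pattern.endsWith('/');
    if (dirOnly) pattern = pattern.slice(0, -1);
    // A slash at the start or in the middle anchors the pattern to the root.
    const anchored = pattern.includes('/');
    if (pattern.startsWith('/')) pattern = pattern.slice(1);
    const prefix = anchored ? '^' : '^(?:.*/)?';
    rules.push({ regex: new RegExp(prefix + globToRegexSource(pattern) + '$'), negated, dirOnly });
  }
  return rules;
}

// `path` is relative to the .gitignore location and uses forward slashes.
// Each ancestor directory is checked first: once a directory is ignored,
// nothing beneath it can be re-included.
function isIgnored(rules: Rule[], path: string, isDir: boolean): boolean {
  const parts = path.split('/');
  for (let i = 1; i <= parts.length; i++) {
    const candidate = parts.slice(0, i).join('/');
    const candidateIsDir = i < parts.length || isDir;
    let ignored = false;
    for (const rule of rules) {
      if (rule.dirOnly && !candidateIsDir) continue;
      if (rule.regex.test(candidate)) ignored = !rule.negated; // last match wins
    }
    if (ignored) return true;
  }
  return false;
}
```

A scanner could call `isIgnored` on each entry while walking and skip recursion into ignored directories, which is also where an opt-out such as skipGitignore would short-circuit the check.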
Summary
Adds a new `@agentic-db/documents-loader` package for importing/exporting text-based files into the `documents` table, plus CLI commands (`agentic-db docs import/export/list`).

New package: `packages/documents-loader/`
- Parser: frontmatter extraction for `.md`/`.mdx`, plain text for other formats
- Scanner: directory scanner with configurable extensions (`.md`, `.mdx`, `.txt`, `.rst`, `.html`, `.xml`, `.json`, `.yaml`, `.yml`, `.csv`, `.tsv`), ignores `node_modules`/`.git`/etc. Automatically parses `.gitignore` files (root + nested) to skip ignored paths
- Gitignore parser: self-contained, no external deps (`**`, negation, directory-only, character classes, anchored patterns). Suitable for upstreaming to dev-utils
- Importer: upserts by `repo_name + file_path`. Last-write-wins conflict resolution. Supports dry-run, progress callbacks, tag merging, commit hash tracking
- Exporter: selects documents by `repo_name`, writes to disk preserving directory structure, optional frontmatter generation

CLI expansion (`sdk/cli/`)
- New `docs` command with `import`, `export`, and `list` subcommands
- Registered alongside the `search`, `ask`, `embed`, and `config` commands
- Prompts through `inquirerer` when args are omitted

CI
- New `documents-loader-tests` job (no database required)
- `Build documents-loader` step added before cli-e2e-tests

Embeddings: handled automatically. The DB's existing triggers set `embedding_stale = true` on create/update, and the worker picks it up.

Review & Testing Checklist for Human
- Run `cd packages/documents-loader && pnpm test`: all 74 tests should pass
- Review the gitignore parser (`src/gitignore.ts`) for potential upstream to dev-utils
- Review the `sdk-client.ts` duck-typed interface: verify it matches the actual SDK API shape for `document.findFirst/findMany/create/update/delete`
- Run `agentic-db docs import ./some-dir --repo test-repo` against a real database to confirm end-to-end flow
- Roundtrip: import a `.md` with frontmatter, export it, check content is preserved

Notes
- Package setup: `makage build`, `publishConfig.directory: "dist"`, `workspace:*` deps
- ESLint config is `.eslintrc.json` (legacy); individual package `lint` scripts need `ESLINT_USE_FLAT_CONFIG=false` due to ESLint v9
- The `docs` CLI command is added to the custom commands map (renamed from `ragCommands` to `customCommands` to reflect the broader scope)
- The gitignore parser is self-contained (no `ignore` npm dep) so it can be upstreamed to dev-utils later

Link to Devin session: https://app.devin.ai/sessions/e249b6a02652412c8484e5b00fc955dd
Requested by: @pyramation
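For reviewers working through the roundtrip item in the checklist, the property being checked can be sketched as below. This is only an illustration under simplifying assumptions: `parseFrontmatter` and `serializeFrontmatter` are hypothetical names, and the sketch handles flat `key: value` string pairs only, not full YAML, so the package's real parser will behave differently on nested or quoted values.

```typescript
interface ParsedDocument {
  frontmatter: Record<string, string>;
  body: string;
}

// Split a leading `---` block from the body. Files without a frontmatter
// block are returned unchanged as the body.
function parseFrontmatter(source: string): ParsedDocument {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { frontmatter: {}, body: source };

  const frontmatter: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // skip lines that are not `key: value`
    const key = line.slice(0, idx).trim();
    if (key) frontmatter[key] = line.slice(idx + 1).trim();
  }
  return { frontmatter, body: source.slice(match[0].length) };
}

// Inverse of parseFrontmatter for the flat subset handled above.
function serializeFrontmatter(doc: ParsedDocument): string {
  const keys = Object.keys(doc.frontmatter);
  if (keys.length === 0) return doc.body;
  const lines = keys.map((key) => `${key}: ${doc.frontmatter[key]}`);
  return `---\n${lines.join('\n')}\n---\n${doc.body}`;
}

// Example: a markdown file with two frontmatter fields.
const sample = '---\ntitle: Hello\ntags: a, b\n---\n# Heading\n\nBody text.\n';
const parsed = parseFrontmatter(sample);
const roundtripped = serializeFrontmatter(parsed);
console.log(roundtripped === sample);
```

The useful invariant for the manual check is that the body comes back byte-for-byte and every frontmatter key survives, whatever the formatting of the regenerated block.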